Bioinformatics tools for analysing viral genomic data.

Authors

  • R J Orton
  • Q Gu
  • J Hughes
  • M Maabar
  • S Modha
  • S B Vattipally
  • G S Wilkie
  • A J Davison

Abstract

The field of viral genomics and bioinformatics is experiencing a strong resurgence due to high-throughput sequencing (HTS) technology, which enables the rapid and cost-effective sequencing and subsequent assembly of large numbers of viral genomes. In addition, the unprecedented power of HTS technologies has enabled the analysis of intra-host viral diversity and quasispecies dynamics in relation to important biological questions on viral transmission, vaccine resistance and host jumping. HTS also enables the rapid identification of both known and potentially new viruses from field and clinical samples, thus adding new tools to the fields of viral discovery and metagenomics. Bioinformatics has been central to the rise of HTS applications because new algorithms and software tools are continually needed to process and analyse the large, complex datasets generated in this rapidly evolving area. In this paper, the authors give a brief overview of the main bioinformatics tools available for viral genomic research, with a particular emphasis on HTS technologies and their main applications. They summarise the major steps in various HTS analyses, starting with quality control of raw reads and encompassing activities ranging from consensus and de novo genome assembly to variant calling and metagenomics, as well as RNA sequencing.
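
As an illustration of the first step named above, quality control of raw reads, the following is a minimal sketch and not a tool from the paper. It assumes unwrapped four-line FASTQ records with Phred+33 quality encoding; the names `parse_fastq`, `mean_phred` and `quality_filter`, and the default thresholds, are hypothetical and chosen only for this example.

```python
from statistics import mean
from typing import Iterable, Iterator, NamedTuple


class Read(NamedTuple):
    name: str
    sequence: str
    quality: str  # quality string, assumed to be Phred+33 encoded


def parse_fastq(lines: Iterable[str]) -> Iterator[Read]:
    """Yield reads from four-line FASTQ records (no line wrapping assumed)."""
    records = [line.rstrip("\n") for line in lines]
    records = [line for line in records if line]  # ignore blank lines
    for i in range(0, len(records) - 3, 4):
        header, sequence, _separator, quality = records[i:i + 4]
        yield Read(header[1:], sequence, quality)  # drop the leading '@'


def mean_phred(quality: str) -> float:
    """Mean Phred score of a quality string, assuming a +33 ASCII offset."""
    if not quality:
        return 0.0
    return mean(ord(ch) - 33 for ch in quality)


def quality_filter(
    reads: Iterable[Read],
    min_mean_q: float = 20.0,  # illustrative threshold, not from the paper
    min_length: int = 30,      # illustrative threshold, not from the paper
) -> Iterator[Read]:
    """Keep reads that are long enough and whose mean quality is high enough."""
    for read in reads:
        if len(read.sequence) >= min_length and mean_phred(read.quality) >= min_mean_q:
            yield read
```

Dedicated read-trimming tools typically do more than this (adapter removal, sliding-window trimming, paired-end handling); the sketch only shows the basic idea of discarding short or low-quality reads before assembly or variant calling.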

Similar resources

QUENTIN: reconstruction of disease transmissions from viral quasispecies genomic data

Motivation Genomic analysis has become one of the major tools for disease outbreak investigations. However, existing computational frameworks for inference of transmission history from viral genomic data often do not consider intra-host diversity of pathogens and heavily rely on additional epidemiological data, such as sampling times and exposure intervals. This impedes genomic analysis of outb...

eSAGE: managing and analysing data generated with Serial Analysis of Gene Expression (SAGE)

SUMMARY eSAGE is a comprehensive set of software tools for managing and analysing data generated with Serial Analysis of Gene Expression (SAGE).

An Integrative Bioinformatics Approach for Knowledge Discovery

The vast amount of data being generated by large scale omics projects and the computational approaches developed to deal with this data have the potential to accelerate the advancement of our understanding of the molecular basis of genetic diseases. This better understanding may have profound clinical implications and transform the medical practice; for instance, therapeutic management could be...

Applying Agents to Bioinformatics in GeneWeaver

Recent years have seen dramatic and sustained growth in the amount of genomic data being generated, including in late 1999 the first complete sequence of a human chromosome. The challenge now faced by biological scientists is to make sense of this vast amount of accumulated and accumulating data. Fortunately, numerous databases are provided as resources containing relevant data, and there are s...

PUMA2—grid-based high-throughput analysis of genomes and metabolic pathways

The PUMA2 system (available at http://compbio.mcs.anl.gov/puma2) is an interactive, integrated bioinformatics environment for high-throughput genetic sequence analysis and metabolic reconstructions from sequence data. PUMA2 provides a framework for comparative and evolutionary analysis of genomic data and metabolic networks in the context of taxonomic and phenotypic information. Grid infrastruc...

Journal title:
  • Revue scientifique et technique

Volume 35, Issue 1

Pages: -

Publication date: 2016